May 93 - Porting to OODLs
Mikel Evins
At some point a programmer interested in object-oriented dynamic languages will ask,
"why not implement this application in an OODL?" In this article we consider some of
the costs and benefits of OODL development and discuss how to decide whether to port
your application. The primary reason for considering a port to an object-oriented
dynamic language is that you expect development and maintenance to be easier. Most
likely your customers, given two implementations of the same software with the same
apparent characteristics, will not care what language was used to develop it.
In deciding whether to port to an OODL (or to use one to develop a new program) you
must consider various costs and benefits. Briefly, you must assess whether the gains
in development and maintainability appropriately balance any costs associated with the
OODL you are considering. Different object-oriented dynamic environments impose
varying costs and confer varying benefits.
There are certain benefits that OODLs' advocates most often stress when comparing
them to more traditional languages. One such benefit is superior expressive power;
practically speaking, superior expressive power means the ability to express more
computation in less code. Another commonly cited benefit is automatic memory
management: dynamically allocated memory is both allocated and freed automatically,
without the programmer's intervention. Advanced development environments, which
provide numerous services to improve programmer productivity, are also high on the
list of OODL benefits. Finally, a favorite topic of OODL advocates, one whose benefits
are perhaps difficult to adequately convey, is the runtime flexibility afforded by
dynamic typing and introspective functions. Each of these benefits imposes certain
costs, and so critics of OODLs sometimes argue that they are not necessarily beneficial.
Let's consider each feature separately.
Expressive power
Expressive power is a measure of the power of a programming language to express
computational work clearly, succinctly, and generally. A language with great
expressive power is one in which it is easy to describe an algorithm abstractly and in
terms that are easy to understand. Object-oriented dynamic languages are designed
specifically with expressive power in mind. It is often easier in OODLs than in more
conventional languages to express general algorithms, such as a function that sorts any
type of collection using an ordering test passed as a parameter, or an intersection test
that accepts any two geometric objects, or a general pattern-matcher. An argument
against such expressive power is that the high degree of abstraction inherent in highly
expressive languages can conceal the computational cost of an operation, and it is true
that a programmer needs a good understanding of the costs of a language's abstract
operations in order to appropriately optimize a program. In high-level languages
these costs are not always obvious.
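As a rough illustration of the kind of general algorithm described above, here is a minimal sketch in Python, chosen only as a convenient dynamic language; the article does not prescribe one. The names sort_by and precedes are hypothetical. The function sorts any iterable collection using an ordering test passed as a parameter.

```python
from functools import cmp_to_key


def sort_by(items, precedes):
    """Return a new sorted list built from any iterable.

    precedes(a, b) is the caller's ordering test (a hypothetical name):
    it returns true when a should come before b.
    """
    def compare(a, b):
        if precedes(a, b):
            return -1
        if precedes(b, a):
            return 1
        return 0

    return sorted(items, key=cmp_to_key(compare))


# The same function handles a tuple of numbers and a set of strings.
numbers = sort_by((3, 1, 2), lambda a, b: a < b)
words = sort_by({"pear", "fig", "banana"}, lambda a, b: len(a) < len(b))
```

Note that nothing in the call reveals how much work the sort performs, which is exactly the hidden-cost concern raised above.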
Automatic memory management
OODLs usually incorporate automatic memory management schemes, often referred to
as garbage-collection, or GC. Garbage-collection makes the programmer's life much
easier: so much so that it's hard to appreciate unless you have worked with it. In a
system with garbage-collection many common memory errors simply never occur;
they are impossible. For example, a program with automatic memory management
never encounters a dangling pointer and can never fail to free a block of memory that
is no longer in use.
The most common argument against garbage-collection is that it makes the life of the
programmer easier at the expense of the end user, imposing on the user the cost in
CPU time of garbage-collection overhead. In fact, empirical evidence suggests
otherwise; present-day garbage-collected systems often spend less time in
memory-management operations than do programs whose dynamic memory
management is done by hand. This unexpected result is largely because modern
garbage-collectors maintain strict allocation disciplines that make allocation and
deallocation extremely fast; the memory management system can rely on certain
invariants that it knows about to make memory operations fast. By contrast, common
library operations in conventional languages, such as C's malloc() and free(), are
normally much slower because they cannot make the same assumptions about the
layout of the heap that a garbage-collected system can. For example, a system with
stop-and-copy garbage collection can allocate an object by simply incrementing an
index into free space. Most malloc() implementations must conduct a search of free
areas to find one into which the requested block can fit.
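The difference between the two allocation strategies can be sketched as follows. This is a simplified model in Python, not the code of any real collector or malloc() library; the class names are hypothetical, and production allocators use many refinements that are omitted here.

```python
class BumpAllocator:
    """Models allocation in a copying collector: advance an index into free space."""

    def __init__(self, size):
        self.size = size
        self.next_free = 0

    def allocate(self, nbytes):
        if self.next_free + nbytes > self.size:
            raise MemoryError("free space exhausted; a real collector would run here")
        address = self.next_free
        self.next_free += nbytes      # the whole cost: one addition
        return address


class FirstFitAllocator:
    """Models a simple free-list allocator: search for a large enough free area."""

    def __init__(self, holes):
        # holes is a list of (address, length) pairs describing free areas
        self.holes = list(holes)

    def allocate(self, nbytes):
        for i, (address, length) in enumerate(self.holes):
            if length >= nbytes:
                if length == nbytes:
                    del self.holes[i]
                else:
                    self.holes[i] = (address + nbytes, length - nbytes)
                return address
        raise MemoryError("no free area is large enough")


bump = BumpAllocator(1024)
first = bump.allocate(16)
second = bump.allocate(32)

# The first hole (8 bytes) is too small, so the search moves on to the second.
fitted = FirstFitAllocator([(0, 8), (100, 64)]).allocate(32)
```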
There are other costs associated with automatic memory management. Perhaps the
most notorious is the gc-pause, or gc-wait, so called because the garbage-collector
must interrupt processing while it runs, to ensure that the heap is stable while it
frees unused memory. With a well-designed garbage-collector such pauses should be
very brief, and it is usually possible to schedule collections at times the user is
unlikely to notice. In some systems you can use an incremental garbage collector that
distributes collection operations over the continuous operation of the program, though
such collectors tend to be less efficient overall than those that pause.
One issue that needs careful attention is the use of garbage-collected OODLs in
real-time systems: you must be sure that you can guarantee response time, and that
means knowing when collection will happen and how long it will take in the worst case.
There are techniques for controlling garbage-collection in such environments,
however. The easiest is to ensure that no dynamic allocation takes
place during a critical loop. You might, for example, preallocate data structures that
get partially filled during time-critical operations, and execute garbage collection
explicitly at safe times.
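The preallocation technique might look like the sketch below. It uses Python's standard gc module purely for illustration; read_sensor is a hypothetical stand-in for time-critical work. Python still creates short-lived number objects inside the loop, and gc.disable() suspends only its cycle collector, so this shows the shape of the technique rather than a hard real-time guarantee.

```python
import gc
from array import array

SAMPLES = 1000
buffer = array("d", [0.0]) * SAMPLES    # preallocated before the critical section


def read_sensor(i):
    # Hypothetical stand-in for a time-critical data source.
    return i * 0.5


def critical_loop():
    gc.disable()                        # no automatic collection inside the loop
    try:
        for i in range(SAMPLES):
            buffer[i] = read_sensor(i)  # fill in place; the buffer never grows
    finally:
        gc.enable()


critical_loop()
gc.collect()                            # explicit collection at a safe time
```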
Advanced development environments
Object-oriented and dynamic languages, especially Smalltalk and Lisp, are widely
known for the rich development environments associated with them. Of course, it is
one thing to praise a development environment and quite another to praise a
programming language, but it is no accident that some of the best work in development
environments has been done in these languages. Smalltalk and Lisp share certain
features that have facilitated the creation of their environments. Both languages are
designed for incremental, interactive development. Both languages can treat source
code as program data. Both languages can be extended by defining new facilities that
become part of the development system's runtime environment. Both include
introspective features with which programmers can interactively examine program
data.
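To give a flavor of what introspective features look like in practice, here is a small sketch in Python rather than Smalltalk or Lisp. The Account class is hypothetical; type, vars, and the inspect module are standard Python facilities.

```python
import inspect


class Account:
    """Hypothetical class used only to have something to examine."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount


acct = Account("Ada", 100)

class_name = type(acct).__name__        # which class is this object?
fields = vars(acct)                     # its instance data, as a dictionary
methods = [name for name, _ in inspect.getmembers(Account, inspect.isfunction)]
signature = str(inspect.signature(Account.deposit))
```

Each of these expressions can be typed at an interactive prompt while the program is running, which is what makes tools such as inspectors straightforward to build.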
In short, both Smalltalk and Lisp provide good support for the development of
programming tools. It is therefore no surprise that some of the earliest advances in
such tools should have come from Smalltalk and Lisp programmers. We have inherited
from those programmers inventions such as interactive tracing and stepping,
inspectors for examining runtime data, cross-referencing of source changes and
dependencies, graphic display of class hierarchies and call trees, debuggers that
support inspecting and changing variables on the stack and that can restart a halted
computation from a user-selected stack-frame, and so on. These facilities came to
exist in large part because Smalltalk and Lisp programmers realized that they could
easily implement them; they didn't have to wait for someone else to design a separate
utility program that could read their source files. Instead, they could make small
extensions to their development systems' runtimes, accumulating and improving those
changes over time until they had developed facilities of great power.
All programmers have benefited from these inventions; they are gradually becoming
standard parts of the development systems for conventional languages. OODL
programmers are not standing still, however, and their languages still have the
built-in support that encourages good tool development.
The cost of these many useful tools has historically been that they are so tightly bound
to the language runtime that it is impossible to separate the tools from the application.
As a result, programs developed in Smalltalk and Lisp have traditionally been very
large because they included with them the entire development system. This close
integration of the application with the development system is actually beneficial for
some in-house developers because they can easily examine a running application to
determine the cause of a program error. It is clearly inappropriate, however, for
most commercial development. OODL designers have realized that they must support a
delivery model that is more in line with the needs of commercial developers and are
beginning to release OODL products that separate the development environment from
the application. When you are considering whether to move to OODL development you
need to find out the minimum size of an application developed with the systems you are
evaluating.
Flexibility
Perhaps hardest to explain of the commonly described benefits of OODL development,
the flexibility of an object-oriented dynamic language is nevertheless one of its most
appealing features. OODL flexibility is made up of equal parts of the other listed
benefits; it grows out of the synergy among expressive power, automatic memory
management, and an advanced development environment. Because an OODL includes a
library of utility classes, and because the interface to the library is defined using
very powerful, general abstractions, and because the development environment is
designed to support fast interactive development, you can quickly and easily get a data
structure built and a piece of code running. Because you can call all of your routines
interactively, using data structures that you can build interactively, you can quickly
and easily test your designs. Because of the fast turnaround time of an interactive
development environment and the power of abstract, polymorphic protocols, you can
switch data representations quickly and easily. Because you can run your program, or
just subsystems or even individual routines, interactively, without ever leaving the
development system, you can find an error quickly and use the debuggers, steppers,
inspectors, and other tools to identify the exact nature of the problem. Once you have
found a problem the incremental compiler and interactive environment make it easy to
correct the problem and test the change.
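Switching data representations behind a polymorphic protocol can be sketched like this. The example is in Python, and the class and function names are hypothetical. The client routine drain relies only on the put, get, and length operations, so either representation can be substituted without changing it.

```python
from collections import deque


class ListQueue:
    """First representation: a plain list."""

    def __init__(self):
        self._items = []

    def put(self, item):
        self._items.append(item)

    def get(self):
        return self._items.pop(0)

    def __len__(self):
        return len(self._items)


class DequeQueue:
    """Second representation: a double-ended queue with the same protocol."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def get(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


def drain(queue, values):
    # Client code: depends on the protocol, not on the representation.
    for value in values:
        queue.put(value)
    result = []
    while len(queue):
        result.append(queue.get())
    return result


from_list = drain(ListQueue(), [1, 2, 3])
from_deque = drain(DequeQueue(), [1, 2, 3])
```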
The hidden cost of this development flexibility is that it can play upon our love of
new features as programmers and tempt us to do more than we should: to add more
features because it's easy, to try to generalize algorithms beyond reasonable utility
because we can, to add ever more elaborate programming utilities because the
environment supports them, and so on. Less flexible environments impose a sort of
discipline, if only a crude one. With a more flexible and powerful development system
more of the discipline lies with the programmer.
Aside from the matter of whether the benefits are in themselves costly, there are
other costs associated with OODLs. The most obvious is that a programmer who
switches to an OODL must learn the language and its idioms. Regardless of the real
benefits to be had from OODL development, you will need to consider the time it will
take for programmers to become familiar with a new language and development
environment before deciding that a change is appropriate. Most good programmers can
adapt to the superficial differences of syntax and a new user interface in a week or
two, but you should expect new OODL programmers to be adjusting to dynamic features
and programming idioms for some months. It can be extremely helpful to have at least
one contributor, respected by the programming team, with solid experience in
dynamic language development.
Aside from the cost of changing environments, each OODL imposes some minimum RAM
requirement because of the runtime system that is part of every application. That
minimum size varies from one language implementation to another. Some products,
such as MacScheme, Object Logo, and Prograph, generate applications as small as 100
to 500 Kbytes and require perhaps 200K to 1 Megabyte to run. Others, such as
various Common Lisp and Smalltalk products, demand a megabyte or more of disk space
and anywhere up to four or five megabytes of RAM. In defense of OODLs with large
minimum sizes, we can say that adding application features usually increases the
application size only slowly because built-in library code provides so much
functionality and is nearly always designed with general reusability in mind.
Frequently the growth curve as features are added is significantly flatter in an OODL
application than in an equivalent application written in a more conventional language. For
a given OODL there is usually a level of application complexity at which a C or Pascal
implementation equals or exceeds the size of the same application written in the OODL.
Before deciding whether to port your application to an OODL be sure to weigh the gains
to be had from the change of language against the costs associated with various
candidates, the cost of converting, and the costs of development and delivery in the
chosen language.
When and why to port
How will you decide whether to port development to an object-oriented dynamic
language? Not every project would benefit from porting. Let's take a look at several
considerations that might make OODL development an attractive option.
Complex data management problems
One of the best reasons to switch to an object-oriented dynamic language is that you
need to support complicated, dynamically-managed data structures. OODLs share a high
level of support for the procedural and data abstractions necessary to manage such
structures. For example, Smalltalk, CLOS, Dylan, Self and Prograph all have good
support for complex data abstractions and models of data that rely on abstract objects,
not on memory addresses and explicit representations. Because most OODLs provide
automatic memory management, they reduce memory management problems to those
associated with choosing appropriate data structures and ensuring they are populated
correctly; all issues of disposal are eliminated. The more significant your dynamic
memory management problems (that is, the more complex your dynamic data
structures), the more you are likely to gain from OODL development.
Robustness a high priority
OODLs can help improve an application's robustness in several ways. For example,
certain classes of memory errors are impossible in a runtime environment with
automatic memory management. Bus errors are almost unheard of in OODL
development, except among programmers who use foreign function facilities to call
code written in more traditional languages.
Many dynamic languages, such as Common Lisp, Dylan, and Smalltalk, provide library
classes for handling exceptions, greatly simplifying the task of managing errors and
other exceptional conditions. Formally defined exception-handlers encapsulate
program control so that your application can invoke condition-handlers and non-local
exits in a safe, structured way, and with much less work than building your own
exception-handling systems from scratch.
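A structured non-local exit of the kind described above might look like the following sketch. It uses Python's built-in exception mechanism rather than the Common Lisp, Dylan, or Smalltalk facilities named in the text, and SaveFailed, write_record, save_all, and save_command are hypothetical names.

```python
class SaveFailed(Exception):
    """Hypothetical application-level condition."""


log = []


def write_record(record):
    if record is None:
        raise SaveFailed("nothing to write")
    log.append(f"wrote {record}")


def save_all(records):
    log.append("open file")
    try:
        for record in records:
            write_record(record)
    finally:
        log.append("close file")    # cleanup runs even during a non-local exit


def save_command(records):
    try:
        save_all(records)
        return "saved"
    except SaveFailed as condition:
        # Control arrives here directly from write_record, two calls down.
        return f"not saved: {condition}"


result = save_command(["a", None, "b"])
```

The handler in save_command receives control from deep inside the call chain, and the cleanup step in save_all still runs on the way out.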
Runtime type-checking makes it easier to catch certain classes of error during
development. In statically-typed languages a program that compiles is presumed to be
free of type errors, but, in fact, runtime type errors can still occur and can be
disastrous. Dynamic languages can catch runtime type errors and, using their
exception-handling features, signal such errors to the programmer. Using a built-in
exception-handling mechanism you can implement handlers that prevent crashes even
in the presence of serious program errors and protect your users from system
crashes and data losses.
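As a small sketch of this idea, again in Python with hypothetical function names, the runtime detects a type error when a string turns up in a list of numbers and signals it as an exception. The handler reports the problem and lets the program continue.

```python
def average(values):
    return sum(values) / len(values)


def safe_average(values):
    try:
        return average(values)
    except TypeError as error:
        # The runtime caught the type error and signaled it; report and recover.
        print(f"could not average {values!r}: {error}")
        return None


good = safe_average([1, 2, 3])
bad = safe_average([1, "two", 3])
```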
If runtime robustness, especially safety from hard crashes, is a high priority then it
might be worthwhile to consider switching to OODL development. As an example of the
trade-offs involved in choosing OODL development over traditional languages, one
developer involved in application testing reported that an application implemented in C
crashed unexpectedly under low memory conditions; the same application ported to a